Overview

Dataset statistics

Number of variables: 13
Number of observations: 1586614
Missing cells: 68148
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 567.8 MiB
Average record size in memory: 375.3 B
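
The percentage and per-record figures can be reproduced from the raw counts. The following is a minimal sketch of that arithmetic, assuming the report divides missing cells by variables times observations and treats 1 MiB as 2**20 bytes; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the dataset statistics in the report.
n_variables = 13
n_observations = 1_586_614
missing_cells = 68_148
total_size_mib = 567.8  # already rounded to one decimal in the report

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded, the per-record value is only expected to agree with the report to about one decimal place.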

Variable types

NUM: 9
CAT: 4

Reproduction

Analysis started: 2021-02-26 05:26:30.582051
Analysis finished: 2021-02-26 05:28:46.240468
Duration: 2 minutes and 15.66 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Configuration file: config.yaml
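
The reported duration follows directly from the two timestamps. A short sketch using only the standard library (the format string is an assumption about how the timestamps were written):

```python
from datetime import datetime

# Assumed timestamp layout: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-02-26 05:26:30.582051", FMT)
finished = datetime.strptime("2021-02-26 05:28:46.240468", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```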

Warnings

brewery_name has a high cardinality: 5742 distinct values
review_profilename has a high cardinality: 33387 distinct values
beer_style has a high cardinality: 104 distinct values
beer_name has a high cardinality: 56857 distinct values
beer_abv has 67785 (4.3%) missing values
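
The report does not state the rule behind these warnings. As a rough sketch, a high-cardinality check can be modelled as a distinct-count threshold and a missing-value check as a share of rows. The threshold of 50 below is an assumption (it is only known to be below 104, since beer_style is flagged), and the function names are hypothetical.

```python
# Counts copied from the report; the threshold is an assumed value.
N_OBSERVATIONS = 1_586_614
CARDINALITY_THRESHOLD = 50  # assumption, not stated in the report

distinct_counts = {
    "brewery_name": 5742,
    "review_profilename": 33387,
    "beer_style": 104,
    "beer_name": 56857,
}
missing_counts = {
    "brewery_name": 15,
    "review_profilename": 348,
    "beer_abv": 67785,
}


def high_cardinality(distinct, threshold=CARDINALITY_THRESHOLD):
    """Return the columns whose distinct count exceeds the threshold."""
    return sorted(name for name, n in distinct.items() if n > threshold)


def missing_share(missing, n_rows=N_OBSERVATIONS):
    """Return the percentage of missing values per column."""
    return {name: 100 * n / n_rows for name, n in missing.items()}


flagged = high_cardinality(distinct_counts)
shares = missing_share(missing_counts)
print(flagged)
print({name: round(pct, 2) for name, pct in shares.items()})
```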

Variables

brewery_id
Real number (ℝ≥0)

Distinct count: 5840
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3130.0992018222455
Minimum: 1
Maximum: 28003
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 30
Q1: 143
Median: 429
Q3: 2372
95-th percentile: 16866
Maximum: 28003
Range: 28002
Interquartile range (IQR): 2229

Descriptive statistics

Standard deviation: 5578.103987
Coefficient of variation (CV): 1.782085368
Kurtosis: 3.408354127
Mean: 3130.099202
Median Absolute Deviation (MAD): 366
Skewness: 2.083747568
Sum: 4966259215
Variance: 31115244.1
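
Several of the brewery_id statistics are derived from one another. The sketch below recomputes them from the reported values using the standard definitions (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1); the inputs are rounded as printed, so the results agree only up to rounding.

```python
# Values copied from the brewery_id statistics in the report.
mean = 3130.099202
std = 5578.103987
q1, q3 = 143, 2372
minimum, maximum = 1, 28003
n_observations = 1_586_614
distinct = 5840

cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance is the squared std
iqr = q3 - q1                         # interquartile range
value_range = maximum - minimum       # range
unique_pct = 100 * distinct / n_observations
total = mean * n_observations         # should be close to the reported sum

print(cv, variance, iqr, value_range, round(unique_pct, 1), round(total))
```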
Histogram with fixed size bins (bins=10)

Common values

Value                  Count     Frequency (%)
35                     39444     2.5%
10099                  33839     2.1%
147                    33066     2.1%
140                    28751     1.8%
287                    25191     1.6%
132                    24083     1.5%
1199                   20004     1.3%
345                    19479     1.2%
220                    16837     1.1%
30                     16107     1.0%
3818                   15868     1.0%
29                     15815     1.0%
158                    14935     0.9%
1549                   14534     0.9%
26                     14039     0.9%
22                     13921     0.9%
45                     13902     0.9%
192                    13410     0.8%
392                    12248     0.8%
694                    11842     0.7%
68                     11697     0.7%
863                    11311     0.7%
112                    11280     0.7%
590                    11172     0.7%
73                     10943     0.7%
Other values (5815)    1132896   71.4%

Minimum 10 values

Value    Count    Frequency (%)
1        1357     0.1%
2        40       < 0.1%
3        5357     0.3%
4        7321     0.5%
5        728      < 0.1%
6        367      < 0.1%
8        3775     0.2%
9        897      0.1%
10       56       < 0.1%
11       23       < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
28003    2        < 0.1%
28000    1        < 0.1%
27984    1        < 0.1%
27980    3        < 0.1%
27945    1        < 0.1%
27944    1        < 0.1%
27934    4        < 0.1%
27927    2        < 0.1%
27922    3        < 0.1%
27920    1        < 0.1%

brewery_name
Categorical

HIGH CARDINALITY

Distinct count: 5742
Unique (%): 0.4%
Missing: 15
Missing (%): < 0.1%
Memory size: 12.1 MiB

Boston Beer Company (Samuel Adams): 39444
Dogfish Head Brewery: 33839
Stone Brewing Co.: 33066
Sierra Nevada Brewing Co.: 28751
Bell's Brewery, Inc.: 25191
Other values (5737): 1426308

Common values

Value                                  Count     Frequency (%)
Boston Beer Company (Samuel Adams)     39444     2.5%
Dogfish Head Brewery                   33839     2.1%
Stone Brewing Co.                      33066     2.1%
Sierra Nevada Brewing Co.              28751     1.8%
Bell's Brewery, Inc.                   25191     1.6%
Rogue Ales                             24083     1.5%
Founders Brewing Company               20004     1.3%
Victory Brewing Company                19479     1.2%
Lagunitas Brewing Company              16837     1.1%
Avery Brewing Company                  16107     1.0%
Southern Tier Brewing Company          15868     1.0%
Anheuser-Busch                         15815     1.0%
Great Divide Brewing Company           14935     0.9%
Goose Island Beer Co.                  14534     0.9%
Three Floyds Brewing Co. & Brewpub     14039     0.9%
Unibroue                               13921     0.9%
Brooklyn Brewery                       13902     0.9%
New Belgium Brewing                    13410     0.8%
Weyerbacher Brewing Co.                12248     0.8%
Tröegs Brewing Company                 11842     0.7%
Flying Dog Brewery                     11697     0.7%
Russian River Brewing Company          11311     0.7%
North Coast Brewing Co.                11280     0.7%
New Glarus Brewing Company             11172     0.7%
Great Lakes Brewing Company            10943     0.7%
Other values (5717)                    1132881   71.4%

Length

Max length: 66
Median length: 23
Mean length: 23.61012761
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 132
Unique unicode categories: 12
Unique unicode scripts: 2
Unique unicode blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
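
As an illustration of the per-category counts reported for textual variables, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. This is not the report's own implementation: the helper name `category_counts` and the mapping of two-letter codes to the long names used in the report are assumptions, and `unicodedata` does not expose scripts or blocks, so those tables are not reproduced.

```python
import unicodedata
from collections import Counter

# Long names as they appear in the report, for a few two-letter codes.
LONG_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
}


def category_counts(text):
    """Count characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One brewery name from the report; \u00f6 is the precomposed "o with diaeresis".
counts = category_counts("Tr\u00f6egs Brewing Company")
for code, n in counts.most_common():
    print(LONG_NAMES.get(code, code), n)
```

Applied to a whole column, the same counter would be updated once per value, which gives tables of the kind shown in the following sections.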

Most occurring characters

ValueCountFrequency (%) 
e393997510.5%
 
393149810.5%
 
r31989788.5%
 
n23414656.3%
 
a20418735.5%
 
o20098635.4%
 
i19094565.1%
 
B17142104.6%
 
w13066273.5%
 
s11727553.1%
 
g11357123.0%
 
y11338583.0%
 
C9895352.6%
 
t9710152.6%
 
m9102082.4%
 
l9017262.4%
 
u8055572.2%
 
p7278461.9%
 
d5603631.5%
 
h4982531.3%
 
.4692571.3%
 
S4175021.1%
 
c3600061.0%
 
A3047600.8%
 
b2610670.7%
 
Other values (107)34467949.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2699041072.1%
 
Uppercase Letter550847914.7%
 
Space Separator393150210.5%
 
Other Punctuation8096942.2%
 
Open Punctuation692720.2%
 
Close Punctuation692720.2%
 
Dash Punctuation568260.2%
 
Decimal Number228480.1%
 
Control855< 0.1%
 
Final Punctuation720< 0.1%
 
Math Symbol270< 0.1%
 
Modifier Symbol11< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B171421031.1%
 
C98953518.0%
 
S4175027.6%
 
A3047605.5%
 
H1938683.5%
 
L1918393.5%
 
D1787463.2%
 
T1714163.1%
 
G1642193.0%
 
P1595442.9%
 
R1441502.6%
 
F1333912.4%
 
M1271012.3%
 
N1211882.2%
 
W997401.8%
 
V830801.5%
 
I823871.5%
 
O684201.2%
 
K533061.0%
 
E304090.6%
 
U295530.5%
 
J265900.5%
 
Y141780.3%
 
Z38280.1%
 
Ø34380.1%
 
Other values (12)2081< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e393997514.6%
 
r319897811.9%
 
n23414658.7%
 
a20418737.6%
 
o20098637.4%
 
i19094567.1%
 
w13066274.8%
 
s11727554.3%
 
g11357124.2%
 
y11338584.2%
 
t9710153.6%
 
m9102083.4%
 
l9017263.3%
 
u8055573.0%
 
p7278462.7%
 
d5603632.1%
 
h4982531.8%
 
c3600061.3%
 
b2610671.0%
 
k2404300.9%
 
v1877460.7%
 
f1206720.4%
 
j952590.4%
 
z614040.2%
 
ä242160.1%
 
Other values (32)740800.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3931498> 99.9%
 
 4< 0.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.46925758.0%
 
&11801714.6%
 
'9961012.3%
 
/630317.8%
 
,514566.4%
 
#40520.5%
 
;40000.5%
 
?144< 0.1%
 
"81< 0.1%
 
@46< 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
2683229.9%
 
1450119.7%
 
3334914.7%
 
817957.9%
 
516737.3%
 
714716.4%
 
614236.2%
 
47273.2%
 
05672.5%
 
95102.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-56826100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(69246> 99.9%
 
[26< 0.1%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)69246> 99.9%
 
]26< 0.1%
 

Most frequent Control characters

ValueCountFrequency (%) 
š64074.9%
 
Ž18021.1%
 
’242.8%
 
Š50.6%
 
ž20.2%
 
“20.2%
 
”20.2%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
´763.6%
 
`436.4%
 

Most frequent Final Punctuation characters

ValueCountFrequency (%) 
720100.0%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
+270100.0%
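
The Control category in the tables above lists glyphs that look like ordinary accented letters and quotation marks. One plausible reading, which the report does not confirm, is that the underlying characters are C1 control code points (U+0080 to U+009F) left over from text that was written as Windows-1252 bytes and decoded as Latin-1. The sketch below checks that reading for the seven glyphs listed; the mapping from glyph to code point is an assumption.

```python
import unicodedata

# Assumed C1 code points behind the glyphs listed under "Control".
c1_codes = [0x9A, 0x8E, 0x92, 0x8A, 0x9E, 0x93, 0x94]

categories = []
decoded = []
for code in c1_codes:
    # As a Unicode code point, each value is a control character (Cc).
    categories.append(unicodedata.category(chr(code)))
    # As a Windows-1252 byte, the same value is a printable character.
    decoded.append(bytes([code]).decode("cp1252"))

print(categories)
print(decoded)
```

If this reading is right, re-encoding the affected values as Latin-1 and decoding them as Windows-1252 before profiling would move these characters out of the Control category.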
 

Most occurring scripts

ValueCountFrequency (%) 
Latin3249888986.8%
 
Common496127013.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e393997512.1%
 
r31989789.8%
 
n23414657.2%
 
a20418736.3%
 
o20098636.2%
 
i19094565.9%
 
B17142105.3%
 
w13066274.0%
 
s11727553.6%
 
g11357123.5%
 
y11338583.5%
 
C9895353.0%
 
t9710153.0%
 
m9102082.8%
 
l9017262.8%
 
u8055572.5%
 
p7278462.2%
 
d5603631.7%
 
h4982531.5%
 
S4175021.3%
 
c3600061.1%
 
A3047600.9%
 
b2610670.8%
 
k2404300.7%
 
H1938680.6%
 
Other values (69)24519817.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
393149879.2%
 
.4692579.5%
 
&1180172.4%
 
'996102.0%
 
(692461.4%
 
)692461.4%
 
/630311.3%
 
-568261.1%
 
,514561.0%
 
268320.1%
 
145010.1%
 
#40520.1%
 
;40000.1%
 
333490.1%
 
81795< 0.1%
 
51673< 0.1%
 
71471< 0.1%
 
61423< 0.1%
 
4727< 0.1%
 
720< 0.1%
 
š640< 0.1%
 
0567< 0.1%
 
9510< 0.1%
 
+270< 0.1%
 
Ž180< 0.1%
 
Other values (13)373< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3737679299.8%
 
None826470.2%
 
Punctuation720< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e393997510.5%
 
393149810.5%
 
r31989788.6%
 
n23414656.3%
 
a20418735.5%
 
o20098635.4%
 
i19094565.1%
 
B17142104.6%
 
w13066273.5%
 
s11727553.1%
 
g11357123.0%
 
y11338583.0%
 
C9895352.6%
 
t9710152.6%
 
m9102082.4%
 
l9017262.4%
 
u8055572.2%
 
p7278461.9%
 
d5603631.5%
 
h4982531.3%
 
.4692571.3%
 
S4175021.1%
 
c3600061.0%
 
A3047600.8%
 
b2610670.7%
 
Other values (55)33634279.0%
 

Most frequent None characters

ValueCountFrequency (%) 
ä2421629.3%
 
ö1534018.6%
 
è80069.7%
 
ø62317.5%
 
é57537.0%
 
í44275.4%
 
ü40104.9%
 
Ø34384.2%
 
ô30093.6%
 
ý15381.9%
 
á12081.5%
 
š8711.1%
 
š6400.8%
 
à4620.6%
 
Ö3800.5%
 
å3770.5%
 
î3500.4%
 
Š2590.3%
 
ñ2100.3%
 
ó1960.2%
 
Ž1800.2%
 
Ž1800.2%
 
ê1610.2%
 
À1590.2%
 
ß1490.2%
 
Other values (26)8971.1%
 

Most frequent Punctuation characters

ValueCountFrequency (%) 
720100.0%
 

review_time
Real number (ℝ≥0)

Distinct count: 1577960
Unique (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1224089280.0122108
Minimum: 840672001
Maximum: 1326285348
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 840672001
5-th percentile: 1071431292
Q1: 1173224188
Median: 1239202882
Q3: 1288568405
95-th percentile: 1318389924
Maximum: 1326285348
Range: 485613347
Interquartile range (IQR): 115344217

Descriptive statistics

Standard deviation: 76544274.54
Coefficient of variation (CV): 0.06253161088
Kurtosis: -0.3136982976
Mean: 1224089280
Median Absolute Deviation (MAD): 54219357.5
Skewness: -0.7352727768
Sum: 1.942157189e+15
Variance: 5.859025965e+15
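
The report treats review_time as a plain number. Its magnitude suggests Unix epoch seconds, although the report does not say so. Under that assumption, the minimum and maximum translate to calendar dates as sketched below (the helper `to_utc` is illustrative).

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds):
    """Interpret a value as seconds since 1970-01-01 00:00:00 UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Minimum and maximum copied from the review_time statistics.
earliest = to_utc(840672001)
latest = to_utc(1326285348)
span_days = (latest - earliest).days

print(earliest.isoformat(), latest.isoformat(), span_days)
```

Read this way, statistics such as the sum, variance, and coefficient of variation carry little meaning for this column, while the quantiles remain useful as dates.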
Histogram with fixed size bins (bins=10)

Common values

Value                     Count     Frequency (%)
1101772800                21        < 0.1%
1031101200                8         < 0.1%
926380801                 8         < 0.1%
897091201                 7         < 0.1%
1022112001                7         < 0.1%
980812801                 7         < 0.1%
904867201                 6         < 0.1%
933033601                 6         < 0.1%
902966401                 6         < 0.1%
926294401                 5         < 0.1%
984787201                 5         < 0.1%
926985601                 5         < 0.1%
969580801                 5         < 0.1%
1022374801                4         < 0.1%
926726401                 4         < 0.1%
894931201                 4         < 0.1%
933638401                 4         < 0.1%
1214439821                4         < 0.1%
969753601                 4         < 0.1%
904953601                 4         < 0.1%
1247356800                4         < 0.1%
1002499201                4         < 0.1%
887846401                 4         < 0.1%
999388801                 4         < 0.1%
895276801                 3         < 0.1%
Other values (1577935)    1586471   > 99.9%

Minimum 10 values

Value         Count    Frequency (%)
840672001     1        < 0.1%
884390401     1        < 0.1%
884649601     1        < 0.1%
885340801     1        < 0.1%
885427201     1        < 0.1%
885945601     1        < 0.1%
886723201     1        < 0.1%
887068801     2        < 0.1%
887155201     2        < 0.1%
887414401     2        < 0.1%

Maximum 10 values

Value         Count    Frequency (%)
1326285348    1        < 0.1%
1326284970    1        < 0.1%
1326276656    1        < 0.1%
1326275049    1        < 0.1%
1326274454    1        < 0.1%
1326273710    1        < 0.1%
1326273614    1        < 0.1%
1326273106    1        < 0.1%
1326272438    1        < 0.1%
1326272107    1        < 0.1%
 

review_overall
Real number (ℝ≥0)

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8155808533140387
Minimum: 0.0
Maximum: 5.0
Zeros: 7
Zeros (%): < 0.1%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4.5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7206218681
Coefficient of variation (CV): 0.1888629532
Kurtosis: 1.631038958
Mean: 3.815580853
Median Absolute Deviation (MAD): 0.5
Skewness: -1.023968713
Sum: 6053854
Variance: 0.5192958767
Histogram with fixed size bins (bins=10)

Common values

Value    Count     Frequency (%)
4        582764    36.7%
4.5      324385    20.4%
3.5      301817    19.0%
3        165644    10.4%
5        91320     5.8%
2.5      58523     3.7%
2        38225     2.4%
1.5      12975     0.8%
1        10954     0.7%
0        7         < 0.1%

Minimum 10 values

Value    Count     Frequency (%)
0        7         < 0.1%
1        10954     0.7%
1.5      12975     0.8%
2        38225     2.4%
2.5      58523     3.7%
3        165644    10.4%
3.5      301817    19.0%
4        582764    36.7%
4.5      324385    20.4%
5        91320     5.8%

Maximum 10 values

Value    Count     Frequency (%)
5        91320     5.8%
4.5      324385    20.4%
4        582764    36.7%
3.5      301817    19.0%
3        165644    10.4%
2.5      58523     3.7%
2        38225     2.4%
1.5      12975     0.8%
1        10954     0.7%
0        7         < 0.1%
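
Because review_overall takes only ten distinct values, its summary statistics can be recomputed from the value counts alone. The sketch below does this for the count, sum, mean, and median; the helper `lower_median` is illustrative and returns the lower of the two middle values when the number of observations is even.

```python
# Value counts copied from the review_overall tables in the report.
freq = {
    4.0: 582764, 4.5: 324385, 3.5: 301817, 3.0: 165644, 5.0: 91320,
    2.5: 58523, 2.0: 38225, 1.5: 12975, 1.0: 10954, 0.0: 7,
}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n


def lower_median(freq):
    """Return the smallest value whose cumulative count reaches half of n."""
    half = sum(freq.values()) / 2
    running = 0
    for value in sorted(freq):
        running += freq[value]
        if running >= half:
            return value


median = lower_median(freq)
print(n, total, mean, median)
```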
 

review_aroma
Real number (ℝ≥0)

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.735636077836197
Minimum: 1.0
Maximum: 5.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6976167288
Coefficient of variation (CV): 0.1867464374
Kurtosis: 1.145196752
Mean: 3.735636078
Median Absolute Deviation (MAD): 0.5
Skewness: -0.838530526
Sum: 5927012.5
Variance: 0.4866691003
Histogram with fixed size bins (bins=10)

Common values

Value    Count     Frequency (%)
4        557383    35.1%
3.5      365312    23.0%
4.5      271450    17.1%
3        200030    12.6%
2.5      66359     4.2%
5        64117     4.0%
2        42566     2.7%
1.5      12524     0.8%
1        6873      0.4%

Minimum 10 values

Value    Count     Frequency (%)
1        6873      0.4%
1.5      12524     0.8%
2        42566     2.7%
2.5      66359     4.2%
3        200030    12.6%
3.5      365312    23.0%
4        557383    35.1%
4.5      271450    17.1%
5        64117     4.0%

Maximum 10 values

Value    Count     Frequency (%)
5        64117     4.0%
4.5      271450    17.1%
4        557383    35.1%
3.5      365312    23.0%
3        200030    12.6%
2.5      66359     4.2%
2        42566     2.7%
1.5      12524     0.8%
1        6873      0.4%
 

review_appearance
Real number (ℝ≥0)

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8416416973504584
Minimum: 0.0
Maximum: 5.0
Zeros: 7
Zeros (%): < 0.1%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 5
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6160927689
Coefficient of variation (CV): 0.160372262
Kurtosis: 1.738866541
Mean: 3.841641697
Median Absolute Deviation (MAD): 0.5
Skewness: -0.9024199172
Sum: 6095202.5
Variance: 0.3795702999
Histogram with fixed size bins (bins=10)

Common values

Value    Count     Frequency (%)
4        674186    42.5%
3.5      318529    20.1%
4.5      288108    18.2%
3        166009    10.5%
5        65398     4.1%
2.5      39493     2.5%
2        25414     1.6%
1.5      6147      0.4%
1        3323      0.2%
0        7         < 0.1%

Minimum 10 values

Value    Count     Frequency (%)
0        7         < 0.1%
1        3323      0.2%
1.5      6147      0.4%
2        25414     1.6%
2.5      39493     2.5%
3        166009    10.5%
3.5      318529    20.1%
4        674186    42.5%
4.5      288108    18.2%
5        65398     4.1%

Maximum 10 values

Value    Count     Frequency (%)
5        65398     4.1%
4.5      288108    18.2%
4        674186    42.5%
3.5      318529    20.1%
3        166009    10.5%
2.5      39493     2.5%
2        25414     1.6%
1.5      6147      0.4%
1        3323      0.2%
0        7         < 0.1%
 

review_profilename
Categorical

HIGH CARDINALITY

Distinct count: 33387
Unique (%): 2.1%
Missing: 348
Missing (%): < 0.1%
Memory size: 12.1 MiB

northyorksammy: 5817
BuckeyeNation: 4661
mikesgroove: 4617
Thorpe429: 3518
womencantsail: 3497
Other values (33382): 1564156

Common values

Value                   Count     Frequency (%)
northyorksammy          5817      0.4%
BuckeyeNation           4661      0.3%
mikesgroove             4617      0.3%
Thorpe429               3518      0.2%
womencantsail           3497      0.2%
NeroFiddled             3488      0.2%
ChainGangGuy            3471      0.2%
brentk56                3357      0.2%
Phyl21ca                3179      0.2%
WesWes                  3168      0.2%
oberon                  3128      0.2%
feloniousmonk           3081      0.2%
akorsak                 3010      0.2%
BEERchitect             2946      0.2%
Gueuzedude              2938      0.2%
jwc215                  2735      0.2%
russpowell              2696      0.2%
TheManiacalOne          2659      0.2%
Gavage                  2630      0.2%
zeff80                  2622      0.2%
Mora2000                2594      0.2%
tempest                 2559      0.2%
Wasatch                 2541      0.2%
WVbeergeek              2524      0.2%
drabmuh                 2481      0.2%
Other values (33362)    1506349   94.9%

Length

Max length: 16
Median length: 9
Mean length: 8.961438636
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 63
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
e145633610.2%
 
a10594357.5%
 
r10131187.1%
 
o8667216.1%
 
n7647115.4%
 
i7183295.1%
 
t6312304.4%
 
s5989524.2%
 
l5706284.0%
 
d4452653.1%
 
h3969792.8%
 
m3963902.8%
 
c3955282.8%
 
u3842042.7%
 
b3562922.5%
 
g3263642.3%
 
k3030792.1%
 
y2808202.0%
 
p2362541.7%
 
w1799921.3%
 
11707281.2%
 
B1661791.2%
 
f1640451.2%
 
j1272890.9%
 
01262440.9%
 
Other values (38)208323214.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter1192305583.9%
 
Uppercase Letter13483209.5%
 
Decimal Number9468486.7%
 
Other Punctuation121< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e145633612.2%
 
a10594358.9%
 
r10131188.5%
 
o8667217.3%
 
n7647116.4%
 
i7183296.0%
 
t6312305.3%
 
s5989525.0%
 
l5706284.8%
 
d4452653.7%
 
h3969793.3%
 
m3963903.3%
 
c3955283.3%
 
u3842043.2%
 
b3562923.0%
 
g3263642.7%
 
k3030792.5%
 
y2808202.4%
 
p2362542.0%
 
w1799921.5%
 
f1640451.4%
 
j1272891.1%
 
v1064350.9%
 
z876300.7%
 
x440880.4%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
117072818.0%
 
012624413.3%
 
211773812.4%
 
710059910.6%
 
3872399.2%
 
8793088.4%
 
9749697.9%
 
5676257.1%
 
4673977.1%
 
6550015.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B16617912.3%
 
S999057.4%
 
D922446.8%
 
M917176.8%
 
T817946.1%
 
C792785.9%
 
G729795.4%
 
J643534.8%
 
R641184.8%
 
A635614.7%
 
H573074.3%
 
P515163.8%
 
F446353.3%
 
W419133.1%
 
L408173.0%
 
N373902.8%
 
E347192.6%
 
K336372.5%
 
O323792.4%
 
I288752.1%
 
V222151.6%
 
Z150571.1%
 
U129641.0%
 
Y92370.7%
 
X54100.4%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.121100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1327137593.3%
 
Common9469696.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e145633611.0%
 
a10594358.0%
 
r10131187.6%
 
o8667216.5%
 
n7647115.8%
 
i7183295.4%
 
t6312304.8%
 
s5989524.5%
 
l5706284.3%
 
d4452653.4%
 
h3969793.0%
 
m3963903.0%
 
c3955283.0%
 
u3842042.9%
 
b3562922.7%
 
g3263642.5%
 
k3030792.3%
 
y2808202.1%
 
p2362541.8%
 
w1799921.4%
 
B1661791.3%
 
f1640451.2%
 
j1272891.0%
 
v1064350.8%
 
S999050.8%
 
Other values (27)12268959.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
117072818.0%
 
012624413.3%
 
211773812.4%
 
710059910.6%
 
3872399.2%
 
8793088.4%
 
9749697.9%
 
5676257.1%
 
4673977.1%
 
6550015.8%
 
.121< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII14218344100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e145633610.2%
 
a10594357.5%
 
r10131187.1%
 
o8667216.1%
 
n7647115.4%
 
i7183295.1%
 
t6312304.4%
 
s5989524.2%
 
l5706284.0%
 
d4452653.1%
 
h3969792.8%
 
m3963902.8%
 
c3955282.8%
 
u3842042.7%
 
b3562922.5%
 
g3263642.3%
 
k3030792.1%
 
y2808202.0%
 
p2362541.7%
 
w1799921.3%
 
11707281.2%
 
B1661791.2%
 
f1640451.2%
 
j1272890.9%
 
01262440.9%
 
Other values (38)208323214.7%
 

beer_style
Categorical

HIGH CARDINALITY

Distinct count: 104
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.1 MiB

American IPA: 117586
American Double / Imperial IPA: 85977
American Pale Ale (APA): 63469
Russian Imperial Stout: 54129
American Double / Imperial Stout: 50705
Other values (99): 1214748

Common values

Value                               Count     Frequency (%)
American IPA                        117586    7.4%
American Double / Imperial IPA      85977     5.4%
American Pale Ale (APA)             63469     4.0%
Russian Imperial Stout              54129     3.4%
American Double / Imperial Stout    50705     3.2%
American Porter                     50477     3.2%
American Amber / Red Ale            45751     2.9%
Belgian Strong Dark Ale             37743     2.4%
Fruit / Vegetable Beer              33861     2.1%
American Strong Ale                 31945     2.0%
Belgian Strong Pale Ale             31490     2.0%
Saison / Farmhouse Ale              31480     2.0%
American Adjunct Lager              30749     1.9%
Tripel                              30328     1.9%
Witbier                             30140     1.9%
Hefeweizen                          27908     1.8%
American Barleywine                 26728     1.7%
American Brown Ale                  25297     1.6%
American Stout                      24538     1.5%
American Pale Wheat Ale             24204     1.5%
Märzen / Oktoberfest                23523     1.5%
English Pale Ale                    23402     1.5%
German Pilsener                     22155     1.4%
Doppelbock                          21699     1.4%
Winter Warmer                       20661     1.3%
Other values (79)                   620669    39.1%

Length

Max length: 35
Median length: 18
Mean length: 17.86997972
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 58
Unique unicode categories: 7
Unique unicode scripts: 2
Unique unicode blocks: 2

Most occurring characters

ValueCountFrequency (%) 
e338549411.9%
 
310920511.0%
 
r20641807.3%
 
a18577646.6%
 
i17407126.1%
 
l16957376.0%
 
A16117415.7%
 
n14890425.3%
 
m10457193.7%
 
t10093763.6%
 
c9165303.2%
 
o8570233.0%
 
u6699882.4%
 
P6302832.2%
 
g5147741.8%
 
I4646701.6%
 
S4448241.6%
 
s4243351.5%
 
b4207901.5%
 
B3857381.4%
 
/3756481.3%
 
p3446431.2%
 
h2850341.0%
 
D2686230.9%
 
d2649570.9%
 
Other values (33)20759307.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter1970071169.5%
 
Uppercase Letter492332217.4%
 
Space Separator310920511.0%
 
Other Punctuation3780061.3%
 
Open Punctuation1147260.4%
 
Close Punctuation1147260.4%
 
Dash Punctuation12064< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A161174132.7%
 
P63028312.8%
 
I4646709.4%
 
S4448249.0%
 
B3857387.8%
 
D2686235.5%
 
E1763373.6%
 
L1511183.1%
 
W1489723.0%
 
R1398362.8%
 
F945311.9%
 
H771591.6%
 
M700501.4%
 
O613661.2%
 
V428150.9%
 
G383300.8%
 
Q361720.7%
 
T326860.7%
 
C292860.6%
 
K135340.3%
 
Z25910.1%
 
J1546< 0.1%
 
U1114< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e338549417.2%
 
r206418010.5%
 
a18577649.4%
 
i17407128.8%
 
l16957378.6%
 
n14890427.6%
 
m10457195.3%
 
t10093765.1%
 
c9165304.7%
 
o8570234.4%
 
u6699883.4%
 
g5147742.6%
 
s4243352.2%
 
b4207902.1%
 
p3446431.7%
 
h2850341.4%
 
d2649571.3%
 
k2059331.0%
 
w1530200.8%
 
z987440.5%
 
y806250.4%
 
f554690.3%
 
j307490.2%
 
x276240.1%
 
ä235230.1%
 
Other values (4)389260.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3109205100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/37564899.4%
 
&23580.6%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(114726100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)114726100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-12064100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2462403386.8%
 
Common372872713.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e338549413.7%
 
r20641808.4%
 
a18577647.5%
 
i17407127.1%
 
l16957376.9%
 
A16117416.5%
 
n14890426.0%
 
m10457194.2%
 
t10093764.1%
 
c9165303.7%
 
o8570233.5%
 
u6699882.7%
 
P6302832.6%
 
g5147742.1%
 
I4646701.9%
 
S4448241.8%
 
s4243351.7%
 
b4207901.7%
 
B3857381.6%
 
p3446431.4%
 
h2850341.2%
 
D2686231.1%
 
d2649571.1%
 
k2059330.8%
 
E1763370.7%
 
Other values (27)14497865.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
310920583.4%
 
/37564810.1%
 
(1147263.1%
 
)1147263.1%
 
-120640.3%
 
&23580.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2831197499.9%
 
None407860.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e338549412.0%
 
310920511.0%
 
r20641807.3%
 
a18577646.6%
 
i17407126.1%
 
l16957376.0%
 
A16117415.7%
 
n14890425.3%
 
m10457193.7%
 
t10093763.6%
 
c9165303.2%
 
o8570233.0%
 
u6699882.4%
 
P6302832.2%
 
g5147741.8%
 
I4646701.6%
 
S4448241.6%
 
s4243351.5%
 
b4207901.5%
 
B3857381.4%
 
/3756481.3%
 
p3446431.2%
 
h2850341.0%
 
D2686230.9%
 
d2649570.9%
 
Other values (30)20351447.2%
 

Most frequent None characters

ValueCountFrequency (%) 
ä2352357.7%
 
è882121.6%
 
ö844220.7%
 

review_palate
Real number (ℝ≥0)

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7437013665579655
Minimum: 1.0
Maximum: 5.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6822183634
Coefficient of variation (CV): 0.1822309785
Kurtosis: 1.303397287
Mean: 3.743701367
Median Absolute Deviation (MAD): 0.5
Skewness: -0.8691499712
Sum: 5939809
Variance: 0.4654218953
Histogram with fixed size bins (bins=10)

Common values

Value    Count     Frequency (%)
4        606711    38.2%
3.5      338585    21.3%
4.5      253102    16.0%
3        206932    13.0%
2.5      62842     4.0%
5        62190     3.9%
2        38333     2.4%
1.5      11045     0.7%
1        6874      0.4%

Minimum 10 values

Value    Count     Frequency (%)
1        6874      0.4%
1.5      11045     0.7%
2        38333     2.4%
2.5      62842     4.0%
3        206932    13.0%
3.5      338585    21.3%
4        606711    38.2%
4.5      253102    16.0%
5        62190     3.9%

Maximum 10 values

Value    Count     Frequency (%)
5        62190     3.9%
4.5      253102    16.0%
4        606711    38.2%
3.5      338585    21.3%
3        206932    13.0%
2.5      62842     4.0%
2        38333     2.4%
1.5      11045     0.7%
1        6874      0.4%
 

review_taste
Real number (ℝ≥0)

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.792860456292457
Minimum: 1.0
Maximum: 5.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4.5
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7319696099
Coefficient of variation (CV): 0.1929861692
Kurtosis: 1.341669306
Mean: 3.792860456
Median Absolute Deviation (MAD): 0.5
Skewness: -0.9734324438
Sum: 6017805.5
Variance: 0.5357795098
Histogram with fixed size bins (bins=10)

Common values

Value    Count     Frequency (%)
4        541429    34.1%
4.5      336162    21.2%
3.5      324541    20.5%
3        166860    10.5%
5        83977     5.3%
2.5      66534     4.2%
2        41992     2.6%
1.5      15128     1.0%
1        9991      0.6%

Minimum 10 values

Value    Count     Frequency (%)
1        9991      0.6%
1.5      15128     1.0%
2        41992     2.6%
2.5      66534     4.2%
3        166860    10.5%
3.5      324541    20.5%
4        541429    34.1%
4.5      336162    21.2%
5        83977     5.3%

Maximum 10 values

Value    Count     Frequency (%)
5        83977     5.3%
4.5      336162    21.2%
4        541429    34.1%
3.5      324541    20.5%
3        166860    10.5%
2.5      66534     4.2%
2        41992     2.6%
1.5      15128     1.0%
1        9991      0.6%
 

beer_name
Categorical

HIGH CARDINALITY

Distinct count: 56857
Unique (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 12.1 MiB

90 Minute IPA: 3290
India Pale Ale: 3130
Old Rasputin Russian Imperial Stout: 3111
Sierra Nevada Celebration Ale: 3000
Two Hearted Ale: 2728
Other values (56852): 1571355

Common values

Value                                         Count     Frequency (%)
90 Minute IPA                                 3290      0.2%
India Pale Ale                                3130      0.2%
Old Rasputin Russian Imperial Stout           3111      0.2%
Sierra Nevada Celebration Ale                 3000      0.2%
Two Hearted Ale                               2728      0.2%
Arrogant Bastard Ale                          2704      0.2%
Stone Ruination IPA                           2704      0.2%
Sierra Nevada Pale Ale                        2587      0.2%
Stone IPA (India Pale Ale)                    2575      0.2%
Pliny The Elder                               2527      0.2%
Founders Breakfast Stout                      2502      0.2%
Pale Ale                                      2500      0.2%
Sierra Nevada Bigfoot Barleywine Style Ale    2492      0.2%
La Fin Du Monde                               2483      0.2%
60 Minute IPA                                 2475      0.2%
Storm King Stout                              2452      0.2%
Duvel                                         2450      0.2%
Brooklyn Black Chocolate Stout                2447      0.2%
Bell's Hopslam Ale                            2443      0.2%
Samuel Adams Boston Lager                     2418      0.2%
Stone Imperial Russian Stout                  2329      0.1%
HopDevil Ale                                  2302      0.1%
Chocolate Stout                               2292      0.1%
Imperial Stout                                2273      0.1%
Young's Double Chocolate Stout                2257      0.1%
Other values (56832)                          1522143   95.9%

Length

Max length: 75
Median length: 19
Mean length: 20.45317513
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 190
Unique unicode categories: 19
Unique unicode scripts: 5
Unique unicode blocks: 6

Most occurring characters

ValueCountFrequency (%) 
354547310.9%
 
e352708110.9%
 
a20553896.3%
 
r20351126.3%
 
l18172515.6%
 
o16024954.9%
 
i15505594.8%
 
t15402084.7%
 
n13936514.3%
 
s10874523.4%
 
u8928272.8%
 
A7715992.4%
 
S6636092.0%
 
d6567802.0%
 
m5988201.8%
 
h5792551.8%
 
B5681151.8%
 
c5122671.6%
 
P4696121.4%
 
p4657011.4%
 
g4342481.3%
 
k3926831.2%
 
b3479731.1%
 
y3303691.0%
 
I2926130.9%
 
Other values (165)432015213.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2257629869.6%
 
Uppercase Letter537368516.6%
 
Space Separator354549210.9%
 
Decimal Number3376781.0%
 
Other Punctuation2940360.9%
 
Close Punctuation1119390.3%
 
Open Punctuation1119370.3%
 
Dash Punctuation955250.3%
 
Other Symbol2455< 0.1%
 
Control959< 0.1%
 
Math Symbol545< 0.1%
 
Other Letter266< 0.1%
 
Final Punctuation235< 0.1%
 
Other Number174< 0.1%
 
Currency Symbol44< 0.1%
 
Modifier Symbol11< 0.1%
 
Modifier Letter8< 0.1%
 
Initial Punctuation6< 0.1%
 
Format1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A77159914.4%
 
S66360912.3%
 
B56811510.6%
 
P4696128.7%
 
I2926135.4%
 
D2535284.7%
 
H2467814.6%
 
C2437994.5%
 
W2006963.7%
 
L1986303.7%
 
R1945973.6%
 
T1935403.6%
 
O1837333.4%
 
M1730283.2%
 
F1411952.6%
 
G1299522.4%
 
E1050012.0%
 
N965741.8%
 
K734551.4%
 
J446270.8%
 
V387070.7%
 
U281800.5%
 
Y234800.4%
 
X187870.3%
 
Q74910.1%
 
Other values (25)123560.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e352708115.6%
 
a20553899.1%
 
r20351129.0%
 
l18172518.0%
 
o16024957.1%
 
i15505596.9%
 
t15402086.8%
 
n13936516.2%
 
s10874524.8%
 
u8928274.0%
 
d6567802.9%
 
m5988202.7%
 
h5792552.6%
 
c5122672.3%
 
p4657012.1%
 
g4342481.9%
 
k3926831.7%
 
b3479731.5%
 
y3303691.5%
 
w1781900.8%
 
v1776290.8%
 
f1733910.8%
 
z789010.3%
 
x439580.2%
 
é195500.1%
 
Other values (43)845580.4%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3545473> 99.9%
 
 19< 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
010040529.7%
 
16383018.9%
 
24905614.5%
 
8225446.7%
 
5214386.3%
 
9199025.9%
 
3183205.4%
 
4156044.6%
 
6141884.2%
 
7123913.7%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(11140799.5%
 
[5300.5%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)11140999.5%
 
]5300.5%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'17149358.3%
 
.6317321.5%
 
#141624.8%
 
&123144.2%
 
/119324.1%
 
"64792.2%
 
,41491.4%
 
:30781.0%
 
!28701.0%
 
%25370.9%
 
*5940.2%
 
?4490.2%
 
;3990.1%
 
§3840.1%
 
¿23< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-95519> 99.9%
 
6< 0.1%
 

Most frequent Initial Punctuation characters

ValueCountFrequency (%) 
350.0%
 
233.3%
 
«116.7%
 

Most frequent Other Symbol characters

ValueCountFrequency (%) 
°2455100.0%
 

Most frequent Control characters

ValueCountFrequency (%) 
Ž36237.7%
 
’31633.0%
 
ž15416.1%
 
š9810.2%
 
Š181.9%
 
–60.6%
 
œ30.3%
 
‘20.2%
 

Most frequent Final Punctuation characters

ValueCountFrequency (%) 
23198.3%
 
31.3%
 
»10.4%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
+53998.9%
 
=40.7%
 
~20.4%
 

Most frequent Other Letter characters

ValueCountFrequency (%) 
22986.1%
 
º269.8%
 
20.8%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
´1090.9%
 
^19.1%
 

Most frequent Other Number characters

ValueCountFrequency (%) 
³13879.3%
 
½3520.1%
 
²10.6%
 

Most frequent Modifier Letter characters

ValueCountFrequency (%) 
ʼ787.5%
 
112.5%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$44100.0%
 

Most frequent Format characters

ValueCountFrequency (%) 
1100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2795000686.1%
 
Common450104513.9%
 
Han234< 0.1%
 
Katakana6< 0.1%
 
Greek3< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e352708112.6%
 
a20553897.4%
 
r20351127.3%
 
l18172516.5%
 
o16024955.7%
 
i15505595.5%
 
t15402085.5%
 
n13936515.0%
 
s10874523.9%
 
u8928273.2%
 
A7715992.8%
 
S6636092.4%
 
d6567802.3%
 
m5988202.1%
 
h5792552.1%
 
B5681152.0%
 
c5122671.8%
 
P4696121.7%
 
p4657011.7%
 
g4342481.6%
 
k3926831.4%
 
b3479731.2%
 
y3303691.2%
 
I2926131.0%
 
D2535280.9%
 
Other values (93)311080911.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
354547378.8%
 
'1714933.8%
 
)1114092.5%
 
(1114072.5%
 
01004052.2%
 
-955192.1%
 
1638301.4%
 
.631731.4%
 
2490561.1%
 
8225440.5%
 
5214380.5%
 
9199020.4%
 
3183200.4%
 
4156040.3%
 
6141880.3%
 
#141620.3%
 
7123910.3%
 
&123140.3%
 
/119320.3%
 
"64790.1%
 
,41490.1%
 
:30780.1%
 
!28700.1%
 
%25370.1%
 
°24550.1%
 
Other values (35)49170.1%
 

Most frequent Han characters

ValueCountFrequency (%) 
22997.9%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 

Most frequent Katakana characters

ValueCountFrequency (%) 
233.3%
 
116.7%
 
116.7%
 
116.7%
 
116.7%
 

Most frequent Greek characters

ValueCountFrequency (%) 
Ω3100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3236603899.7%
 
None847620.3%
 
Punctuation246< 0.1%
 
CJK234< 0.1%
 
Katakana7< 0.1%
 
Modifier Letters7< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
354547311.0%
 
e352708110.9%
 
a20553896.4%
 
r20351126.3%
 
l18172515.6%
 
o16024955.0%
 
i15505594.8%
 
t15402084.8%
 
n13936514.3%
 
s10874523.4%
 
u8928272.8%
 
A7715992.4%
 
S6636092.1%
 
d6567802.0%
 
m5988201.9%
 
h5792551.8%
 
B5681151.8%
 
c5122671.6%
 
P4696121.5%
 
p4657011.4%
 
g4342481.3%
 
k3926831.2%
 
b3479731.1%
 
y3303691.0%
 
I2926130.9%
 
Other values (61)423489613.1%
 

Most frequent None characters

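Value | Count | Frequency (%)
é | 19550 | 23.1%
ö | 14822 | 17.5%
ä | 11881 | 14.0%
è | 7519 | 8.9%
ü | 6147 | 7.3%
ë | 3216 | 3.8%
ô | 3055 | 3.6%
° | 2455 | 2.9%
É | 2306 | 2.7%
ê | 1870 | 2.2%
Ü | 1409 | 1.7%
ø | 1218 | 1.4%
á | 1179 | 1.4%
å | 814 | 1.0%
í | 809 | 1.0%
Ø | 786 | 0.9%
æ | 602 | 0.7%
ó | 437 | 0.5%
š | 421 | 0.5%
§ | 384 | 0.5%
Ž | 362 | 0.4%
ã | 330 | 0.4%
’ | 316 | 0.4%
ý | 301 | 0.4%
Ô | 216 | 0.3%
Other values (60) | 2357 | 2.8%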

Most frequent Punctuation characters

ValueCountFrequency (%) 
23193.9%
 
62.4%
 
31.2%
 
31.2%
 
20.8%
 
10.4%
 

Most frequent CJK characters

ValueCountFrequency (%) 
22997.9%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 
10.4%
 

Most frequent Katakana characters

ValueCountFrequency (%) 
228.6%
 
114.3%
 
114.3%
 
114.3%
 
114.3%
 
114.3%
 

Most frequent Modifier Letters characters

Value | Count | Frequency (%)
ʼ | 7 | 100.0%

beer_abv
Real number (ℝ≥0)

MISSING

Distinct count: 530
Unique (%): < 0.1%
Missing: 67785
Missing (%): 4.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0423867532158
Minimum: 0.01
Maximum: 57.7
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 4.5
Q1: 5.2
Median: 6.5
Q3: 8.5
95-th percentile: 11
Maximum: 57.7
Range: 57.69
Interquartile range (IQR): 3.3

Descriptive statistics

Standard deviation: 2.322525993
Coefficient of variation (CV): 0.3297924516
Kurtosis: 6.961811545
Mean: 7.042386753
Median Absolute Deviation (MAD): 1.5
Skewness: 1.543406148
Sum: 10696181.23
Variance: 5.394126987
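
Several of the beer_abv figures above are derived from one another, and a short arithmetic check makes the relationships explicit. The sketch below is only a consistency check on the numbers printed in this report; the variable names are illustrative and are not part of pandas-profiling.

```python
# Values copied from this report (dataset statistics and the beer_abv summary).
n_observations = 1586614
n_missing = 67785
mean = 7.042386753
std = 2.322525993
total = 10696181.23
q1, q3 = 5.2, 8.5
minimum, maximum = 0.01, 57.7

# Derived quantities, recomputed from the values above.
n_valid = n_observations - n_missing                       # rows with a non-missing ABV
missing_pct = round(100 * n_missing / n_observations, 1)   # "Missing (%)"
cv = std / mean                                            # coefficient of variation
variance = std ** 2                                        # variance is the squared standard deviation
iqr = q3 - q1                                              # interquartile range
value_range = maximum - minimum                            # range
mean_from_sum = total / n_valid                            # mean over non-missing rows

print(missing_pct, cv, variance, iqr, value_range, mean_from_sum)
```

Note that the mean is taken over the non-missing rows only, which is why the sum is divided by the number of observations minus the missing count.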
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
5 | 109144 | 6.9%
8 | 67744 | 4.3%
6 | 65383 | 4.1%
7 | 59460 | 3.7%
9 | 59183 | 3.7%
5.5 | 59010 | 3.7%
10 | 54780 | 3.5%
6.5 | 48369 | 3.0%
5.2 | 43268 | 2.7%
7.5 | 39978 | 2.5%
8.5 | 38474 | 2.4%
5.4 | 33123 | 2.1%
4.9 | 30914 | 1.9%
9.5 | 29999 | 1.9%
4.5 | 29358 | 1.9%
5.3 | 29085 | 1.8%
5.6 | 28913 | 1.8%
5.8 | 28571 | 1.8%
4.8 | 27097 | 1.7%
5.1 | 25122 | 1.6%
6.2 | 23587 | 1.5%
7.2 | 23540 | 1.5%
5.9 | 23232 | 1.5%
11 | 23191 | 1.5%
10.5 | 21128 | 1.3%
Other values (505) | 497176 | 31.3%
(Missing) | 67785 | 4.3%
ValueCountFrequency (%) 
0.015< 0.1%
 
0.0517< 0.1%
 
0.081< 0.1%
 
0.111< 0.1%
 
0.253< 0.1%
 
0.378< 0.1%
 
0.474< 0.1%
 
0.4555< 0.1%
 
0.5779< 0.1%
 
0.73< 0.1%
 
ValueCountFrequency (%) 
57.71< 0.1%
 
432< 0.1%
 
4176< 0.1%
 
39.443< 0.1%
 
397< 0.1%
 
3288< 0.1%
 
30.861< 0.1%
 
2916< 0.1%
 
283< 0.1%
 
27355< 0.1%
 

beer_beerid
Real number (ℝ≥0)

Distinct count: 66055
Unique (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21712.79427888573
Minimum: 3
Maximum: 77317
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.1 MiB

Quantile statistics

Minimum: 3
5-th percentile: 213
Q1: 1717
Median: 13906
Q3: 39441
95-th percentile: 62653
Maximum: 77317
Range: 77314
Interquartile range (IQR): 37724

Descriptive statistics

Standard deviation: 21818.336
Coefficient of variation (CV): 1.004860808
Kurtosis: -0.8339342225
Mean: 21712.79428
Median Absolute Deviation (MAD): 13217
Skewness: 0.6893969312
Sum: 3.444982338e+10
Variance: 476039785.7
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
2093 | 3290 | 0.2%
412 | 3111 | 0.2%
1904 | 3000 | 0.2%
1093 | 2728 | 0.2%
92 | 2704 | 0.2%
4083 | 2704 | 0.2%
276 | 2587 | 0.2%
88 | 2575 | 0.2%
7971 | 2527 | 0.2%
11757 | 2502 | 0.2%
2671 | 2492 | 0.2%
34 | 2483 | 0.2%
6108 | 2475 | 0.2%
1013 | 2452 | 0.2%
695 | 2450 | 0.2%
680 | 2447 | 0.2%
17112 | 2443 | 0.2%
104 | 2418 | 0.2%
1160 | 2329 | 0.1%
1005 | 2302 | 0.1%
73 | 2257 | 0.1%
355 | 2234 | 0.1%
1708 | 2217 | 0.1%
754 | 2210 | 0.1%
645 | 2170 | 0.1%
Other values (66030) | 1523507 | 96.0%
Value | Count | Frequency (%)
3 | 3 | < 0.1%
4 | 10 | < 0.1%
5 | 424 | < 0.1%
6 | 877 | 0.1%
7 | 659 | < 0.1%
8 | 68 | < 0.1%
9 | 116 | < 0.1%
10 | 717 | < 0.1%
11 | 85 | < 0.1%
12 | 86 | < 0.1%
Value | Count | Frequency (%)
77317 | 1 | < 0.1%
77316 | 1 | < 0.1%
77315 | 1 | < 0.1%
77314 | 1 | < 0.1%
77313 | 1 | < 0.1%
77312 | 1 | < 0.1%
77310 | 1 | < 0.1%
77309 | 1 | < 0.1%
77308 | 1 | < 0.1%
77307 | 1 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
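
As a minimal sketch of that definition, the function below divides the covariance by the product of the standard deviations using only the Python standard library. The function name `pearson_r` is illustrative and is not a pandas-profiling API. The toy inputs are the beer_abv and review_overall values of the first five sample rows in this report, far too few to say anything about the full dataset.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# beer_abv and review_overall from the first five sample rows (illustration only).
abv = [5.0, 6.2, 6.5, 5.0, 7.7]
overall = [1.5, 3.0, 3.0, 3.0, 4.0]

r = pearson_r(abv, overall)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * a + 3 for a in abv], overall)
print(r, r_rescaled)
```

The same 1/n normalisation is used for the covariance and both standard deviations, so the factor cancels and the result matches the sample (n-1) version.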

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
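
A small sketch of this calculation, assuming tied values receive the average of the ranks they span (a common convention, chosen here as an assumption): rank both variables, then apply the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are illustrative only.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x**3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For the cubic example the ranks of x and y agree exactly, so ρ reaches its maximum, while Pearson's r on the raw values stays below 1.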

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
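
The pair-counting definition can be sketched directly, as below. This is the plain form described in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-corrected variant instead, so results on data with many ties may differ. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The loop is quadratic in the number of observations, which is fine for illustration but would be slow on a dataset of this report's size.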

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

(index) | brewery_id | brewery_name | review_time | review_overall | review_aroma | review_appearance | review_profilename | beer_style | review_palate | review_taste | beer_name | beer_abv | beer_beerid
0 | 10325 | Vecchio Birraio | 1234817823 | 1.5 | 2.0 | 2.5 | stcules | Hefeweizen | 1.5 | 1.5 | Sausa Weizen | 5.0 | 47986
1 | 10325 | Vecchio Birraio | 1235915097 | 3.0 | 2.5 | 3.0 | stcules | English Strong Ale | 3.0 | 3.0 | Red Moon | 6.2 | 48213
2 | 10325 | Vecchio Birraio | 1235916604 | 3.0 | 2.5 | 3.0 | stcules | Foreign / Export Stout | 3.0 | 3.0 | Black Horse Black Beer | 6.5 | 48215
3 | 10325 | Vecchio Birraio | 1234725145 | 3.0 | 3.0 | 3.5 | stcules | German Pilsener | 2.5 | 3.0 | Sausa Pils | 5.0 | 47969
4 | 1075 | Caldera Brewing Company | 1293735206 | 4.0 | 4.5 | 4.0 | johnmichaelsen | American Double / Imperial IPA | 4.0 | 4.5 | Cauldron DIPA | 7.7 | 64883
5 | 1075 | Caldera Brewing Company | 1325524659 | 3.0 | 3.5 | 3.5 | oline73 | Herbed / Spiced Beer | 3.0 | 3.5 | Caldera Ginger Beer | 4.7 | 52159
6 | 1075 | Caldera Brewing Company | 1318991115 | 3.5 | 3.5 | 3.5 | Reidrover | Herbed / Spiced Beer | 4.0 | 4.0 | Caldera Ginger Beer | 4.7 | 52159
7 | 1075 | Caldera Brewing Company | 1306276018 | 3.0 | 2.5 | 3.5 | alpinebryant | Herbed / Spiced Beer | 2.0 | 3.5 | Caldera Ginger Beer | 4.7 | 52159
8 | 1075 | Caldera Brewing Company | 1290454503 | 4.0 | 3.0 | 3.5 | LordAdmNelson | Herbed / Spiced Beer | 3.5 | 4.0 | Caldera Ginger Beer | 4.7 | 52159
9 | 1075 | Caldera Brewing Company | 1285632924 | 4.5 | 3.5 | 5.0 | augustgarage | Herbed / Spiced Beer | 4.0 | 4.0 | Caldera Ginger Beer | 4.7 | 52159

Last rows

(index) | brewery_id | brewery_name | review_time | review_overall | review_aroma | review_appearance | review_profilename | beer_style | review_palate | review_taste | beer_name | beer_abv | beer_beerid
1586604 | 14359 | The Defiant Brewing Company | 1288890206 | 4.0 | 4.5 | 4.5 | njmoons | Pumpkin Ale | 3.5 | 3.5 | The Horseman's Ale | 5.2 | 33061
1586605 | 14359 | The Defiant Brewing Company | 1163291143 | 5.0 | 5.0 | 5.0 | NyackNicky | Pumpkin Ale | 5.0 | 5.0 | The Horseman's Ale | 5.2 | 33061
1586606 | 14359 | The Defiant Brewing Company | 1162871808 | 5.0 | 4.5 | 4.0 | blitheringidiot | Pumpkin Ale | 5.0 | 5.0 | The Horseman's Ale | 5.2 | 33061
1586607 | 14359 | The Defiant Brewing Company | 1162865640 | 5.0 | 5.0 | 4.5 | PopeDX | Pumpkin Ale | 5.0 | 4.5 | The Horseman's Ale | 5.2 | 33061
1586608 | 14359 | The Defiant Brewing Company | 1162685856 | 3.5 | 4.0 | 4.0 | treehugger02010 | Pumpkin Ale | 3.5 | 3.0 | The Horseman's Ale | 5.2 | 33061
1586609 | 14359 | The Defiant Brewing Company | 1162684892 | 5.0 | 4.0 | 3.5 | maddogruss | Pumpkin Ale | 4.0 | 4.0 | The Horseman's Ale | 5.2 | 33061
1586610 | 14359 | The Defiant Brewing Company | 1161048566 | 4.0 | 5.0 | 2.5 | yelterdow | Pumpkin Ale | 2.0 | 4.0 | The Horseman's Ale | 5.2 | 33061
1586611 | 14359 | The Defiant Brewing Company | 1160702513 | 4.5 | 3.5 | 3.0 | TongoRad | Pumpkin Ale | 3.5 | 4.0 | The Horseman's Ale | 5.2 | 33061
1586612 | 14359 | The Defiant Brewing Company | 1160023044 | 4.0 | 4.5 | 4.5 | dherling | Pumpkin Ale | 4.5 | 4.5 | The Horseman's Ale | 5.2 | 33061
1586613 | 14359 | The Defiant Brewing Company | 1160005319 | 5.0 | 4.5 | 4.5 | cbl2 | Pumpkin Ale | 4.5 | 4.5 | The Horseman's Ale | 5.2 | 33061